Superset Learning Based on Generalized Loss Minimization
Abstract
In standard supervised learning, each training instance is associated with an outcome from a corresponding output space (e.g., a class label in classification or a real number in regression). In the superset learning problem, the outcome is only characterized in terms of a superset, that is, a set of candidate outcomes that covers the true outcome but may also contain additional ones. Thus, superset learning can be seen as a specific type of weakly supervised learning, in which training examples are ambiguous. In this paper, we introduce a generic approach to superset learning, which is motivated by the idea of performing model identification and “data disambiguation” simultaneously. This idea is realized by means of a generalized risk minimization approach, using an extended loss function that compares precise predictions with set-valued observations. As an illustration, we instantiate our meta-learning technique for the problem of label ranking, in which the output space consists of all permutations of a fixed set of items. The label ranking method thus obtained is compared to existing approaches tackling the same problem.
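The abstract does not spell out the extended loss, so the following is only a minimal sketch under a stated assumption: the loss between a precise prediction and a set of candidate outcomes is taken to be the smallest base loss over the candidates (an "optimistic" reading that lets the model pick the most favourable disambiguation). The names `squared_loss`, `superset_loss`, and `generalized_empirical_risk`, as well as the toy data, are hypothetical and chosen for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple


def squared_loss(prediction: float, outcome: float) -> float:
    """Ordinary base loss between a precise prediction and a precise outcome."""
    return (prediction - outcome) ** 2


def superset_loss(
    prediction: float,
    candidates: Iterable[float],
    base_loss: Callable[[float, float], float] = squared_loss,
) -> float:
    """Assumed extension: smallest base loss over all candidate outcomes."""
    return min(base_loss(prediction, y) for y in candidates)


def generalized_empirical_risk(
    predict: Callable[[float], float],
    data: List[Tuple[float, Set[float]]],
    base_loss: Callable[[float, float], float] = squared_loss,
) -> float:
    """Average extended loss of a model over set-valued training data."""
    losses = [superset_loss(predict(x), candidates, base_loss) for x, candidates in data]
    return sum(losses) / len(losses)


# Toy data (hypothetical): each instance x comes with a set of candidate outcomes.
data = [(1.0, {2.0, 5.0}), (2.0, {4.0}), (3.0, {1.0, 6.0})]

# The model y = 2x is consistent with one candidate per instance, so its risk is zero.
risk_double = generalized_empirical_risk(lambda x: 2.0 * x, data)

# A constant model cannot match any full disambiguation and incurs a positive risk.
risk_const = generalized_empirical_risk(lambda x: 4.0, data)
```

Under this assumption, minimizing the generalized risk over a model class selects a model and, implicitly, one candidate per instance, which is one way to read "model identification and data disambiguation simultaneously".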
Similar resources
Classification Methods with Reject Option Based on Convex Risk Minimization
In this paper, we investigate the problem of binary classification with a reject option in which one can withhold the decision of classifying an observation at a cost lower than that of misclassification. Since the natural loss function is non-convex so that empirical risk minimization easily becomes infeasible, the paper proposes minimizing convex risks based on surrogate convex loss functions...
Learning from Imprecise and Fuzzy Observations: Data Disambiguation through Generalized Loss Minimization
Methods for analyzing or learning from “fuzzy data” have attracted increasing attention in recent years. In many cases, however, existing methods (for precise, non-fuzzy data) are extended to the fuzzy case in an ad-hoc manner, and without carefully considering the interpretation of a fuzzy set when being used for modeling data. Distinguishing between an ontic and an epistemic interpretation of...
Optimal Capacitor Allocation in Radial Distribution Networks for Annual Costs Minimization Using Hybrid PSO and Sequential Power Loss Index Based Method
In the most recent heuristic methods, the high potential buses for capacitor placement are initially identified and ranked using loss sensitivity factors (LSFs) or power loss index (PLI). These factors or indices help to reduce the search space of the optimization procedure, but they may not always indicate the appropriate placement of capacitors. This paper proposes an efficient approach for t...
A Nearest Neighbor Approach to Label Ranking based on Generalized Labelwise Loss Minimization
In this paper, we introduce a new (meta) learning technique for a preference learning problem called label ranking. As opposed to existing meta techniques, which mostly decompose the original problem into pairwise comparisons, our approach relies on a labelwise decomposition. The basic idea is to train one model per class label, namely a model that maps instances to ranks. We propose a concrete...
Robust Unsupervised Clustering Using Generalized Annealing M-estimator
A new robust clustering algorithm, called generalized annealing M-estimator (GAM-estimator), is proposed. Initialized with multiple seeds, the GAM-estimator converges to several optimal cluster centers. Neither knowledge about the number of clusters nor scale is needed. The global optimal solution of clustering is achieved by minimization of an objective function. The algorithm is applied to un...